Extracting URI Patterns from SPARQL Endpoints
Authors
Abstract
Understanding the structure of identifiers in a dataset is critical for users and applications that want to use that dataset and connect to it. This is especially true in Linked Data where, while benefiting from the general structure of URIs, identifiers are also designed according to specific conventions that are rarely made explicit or documented. In this paper, we present an automatic method for extracting such URI patterns, based on adapting formal concept analysis techniques to the mining of string patterns. The result is a tool that can generate, in a few minutes, documentation of the URI patterns used in a SPARQL endpoint by the instances of each class in the corresponding datasets. We evaluate the approach by demonstrating its performance and efficiency on several endpoints of various origins.
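To make the notion of a URI pattern concrete, here is a minimal Python sketch. It is not the formal concept analysis method described in the abstract; it swaps in a much simpler technique (naive token alignment) purely for illustration. The function names `tokenize` and `uri_patterns`, the delimiter set, and the `example.org` URIs are all assumptions introduced here. In practice, the input list would presumably come from querying an endpoint for the instances of one class.

```python
import re
from collections import defaultdict

def tokenize(uri):
    # Split on common URI delimiters, keeping each delimiter as its own token.
    # The delimiter set is an assumption made for this sketch.
    return [t for t in re.split(r"([/#:_\-.?=&])", uri) if t]

def uri_patterns(uris):
    # Group URIs by token count so that tokens can be compared column by column.
    groups = defaultdict(list)
    for uri in uris:
        tokens = tokenize(uri)
        groups[len(tokens)].append(tokens)

    patterns = []
    for length in sorted(groups):
        columns = zip(*groups[length])
        # Keep a token if every URI in the group agrees on it; otherwise use "*".
        pattern = "".join(
            col[0] if len(set(col)) == 1 else "*" for col in columns
        )
        patterns.append(pattern)
    return patterns

# Hypothetical instance URIs of a single class.
people = [
    "http://example.org/person/123",
    "http://example.org/person/456",
]
print(uri_patterns(people))  # -> ['http://example.org/person/*']
```

A sketch like this only captures positions that vary across equally tokenized URIs; it does not organize patterns into a hierarchy, which is what a concept-lattice approach would add.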
Similar resources
Discoverability of SPARQL Endpoints in Linked Open Data
Accessing Linked Open Data sources with query languages such as SPARQL provides more flexible possibilities than access based on dereferenceable URIs only. However, discovering a SPARQL endpoint on the fly, given a URI, is not trivial. This paper provides a quantitative analysis on the automatic discoverability of SPARQL endpoints using different mechanisms.
Scalewelis: a Scalable Query-based Faceted Search System on Top of SPARQL Endpoints
This paper overviews the participation of Scalewelis in the QALD-3 open challenge. Scalewelis is a Faceted Search system. Faceted Search systems refine the result set at each navigation step. In Scalewelis, refinements are syntactic operations that modify the user query. Scalewelis uses the Semantic Web standards (URI, RDF, SPARQL) and connects to SPARQL endpoints.
Towards Equivalences for Federated SPARQL Queries
The most common way for exposing RDF data on the Web is by means of SPARQL endpoints. These endpoints are Web services that implement the SPARQL protocol and then allow end users and applications to query just the RDF data they want. However the servers hosting the SPARQL endpoints restrict the access to the data by limiting the amount of results returned by user queries or the amount of querie...
LD-VOWL: Extracting and Visualizing Schema Information for Linked Data Endpoints
Users currently face the problem that schema information for Linked Data is often not available. If it is available, it tends to be incomplete or does not adequately represent the data. It can therefore be hard for users to get an impression of the data provided by some Linked Data source. In this paper, we introduce LD-VOWL, a web-based tool that extracts and visualizes schema information of L...
An Empirical Study of Real-World SPARQL Queries
Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBPedia and SWDF public endpoints. We aim at finding which are the most used language elements both from syntactical and structural perspectives, paying s...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014